In this report we analyze a data set from the Center for Policing Equity (CPE), a consortium of research scientists that promotes police transparency and accountability through innovation and collaboration between law enforcement agencies and the communities they serve. The data set is a collection of standardized police behavioral data. We look for problems in the system, such as racism in the police department, and use visualization to extract insights. The ultimate goal is to inform police agencies where they can make improvements by identifying deployment areas where racial disparities exist and are not explainable by crime rates and poverty levels.
We remove every column in which more than 60% of the values are NA. For the remaining columns in which less than 10% of the values are NA, we fill the missing values with the column median. The variables available for visualization are listed in the table below. After cleaning, the data set has 2383 rows and 38 columns, of which 13 are continuous variables and 25 are categorical variables.
| variable | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| INCIDENT_DATE | 1 | 2383 | NaN | NA | NA | NaN | NA | Inf | -Inf | -Inf | NA | NA | NA |
| INCIDENT_TIME* | 2 | 2383 | 270.839698 | 1.595198e+02 | 275.00000 | 270.743052 | 209.0466000 | 1.00000 | 543.00000 | 5.42000e+02 | 0.0094140 | -1.2458302 | 3.2677785 |
| UOF_NUMBER* | 3 | 2383 | 1163.611834 | 6.714928e+02 | 1166.00000 | 1163.887258 | 859.9080000 | 1.00000 | 2328.00000 | 2.32700e+03 | -0.0039594 | -1.2017925 | 13.7555931 |
| OFFICER_ID | 4 | 2383 | 9572.125472 | 1.534883e+03 | 10115.00000 | 9823.988988 | 1083.7806000 | 0.00000 | 11170.00000 | 1.11700e+04 | -1.6462990 | 3.8716754 | 31.4422288 |
| OFFICER_GENDER* | 5 | 2383 | 1.899287 | 3.010120e-01 | 2.00000 | 1.998951 | 0.0000000 | 1.00000 | 2.00000 | 1.00000e+00 | -2.6518482 | 5.0344118 | 0.0061663 |
| OFFICER_RACE* | 6 | 2383 | 5.045741 | 1.285130e+00 | 6.00000 | 5.219192 | 0.0000000 | 1.00000 | 6.00000 | 5.00000e+00 | -0.8456296 | -0.7470492 | 0.0263260 |
| OFFICER_HIRE_DATE* | 7 | 2383 | 150.966009 | 8.608945e+01 | 169.00000 | 152.265338 | 100.8168000 | 1.00000 | 291.00000 | 2.90000e+02 | -0.1429733 | -1.2386832 | 1.7635506 |
| OFFICER_YEARS_ON_FORCE | 8 | 2383 | 8.049098 | 7.562481e+00 | 6.00000 | 6.646565 | 5.9304000 | 0.00000 | 36.00000 | 3.60000e+01 | 1.4833403 | 1.4677128 | 0.1549181 |
| OFFICER_INJURY* | 9 | 2383 | 1.098196 | 2.976413e-01 | 1.00000 | 1.000000 | 0.0000000 | 1.00000 | 2.00000 | 1.00000e+00 | 2.6987908 | 5.2856902 | 0.0060972 |
| OFFICER_INJURY_TYPE* | 10 | 2383 | 49.748216 | 1.048442e+01 | 52.00000 | 52.000000 | 0.0000000 | 1.00000 | 76.00000 | 7.50000e+01 | -3.7374257 | 14.3606302 | 0.2147744 |
| OFFICER_HOSPITALIZATION* | 11 | 2383 | 1.020143 | 1.405177e-01 | 1.00000 | 1.000000 | 0.0000000 | 1.00000 | 2.00000 | 1.00000e+00 | 6.8269807 | 44.6263931 | 0.0028785 |
| SUBJECT_ID | 12 | 2383 | 40255.407889 | 1.240602e+04 | 44573.00000 | 43652.611956 | 2043.0228000 | 0.00000 | 47972.00000 | 4.79720e+04 | -2.4671205 | 4.8364561 | 254.1384834 |
| SUBJECT_RACE* | 13 | 2383 | 4.052455 | 1.542627e+00 | 3.00000 | 3.819612 | 0.0000000 | 1.00000 | 7.00000 | 6.00000e+00 | 1.2030960 | -0.2228576 | 0.0316009 |
| SUBJECT_GENDER* | 14 | 2383 | 1.820394 | 3.979000e-01 | 2.00000 | 1.894075 | 0.0000000 | 1.00000 | 4.00000 | 3.00000e+00 | -1.3654857 | 1.0723672 | 0.0081510 |
| SUBJECT_INJURY* | 15 | 2383 | 1.263953 | 4.408666e-01 | 1.00000 | 1.205034 | 0.0000000 | 1.00000 | 2.00000 | 1.00000e+00 | 1.0703824 | -0.8546395 | 0.0090312 |
| SUBJECT_INJURY_TYPE* | 16 | 2383 | 108.264373 | 4.129537e+01 | 122.00000 | 115.830100 | 0.0000000 | 1.00000 | 193.00000 | 1.92000e+02 | -1.6392554 | 2.0976724 | 0.8459397 |
| SUBJECT_WAS_ARRESTED* | 17 | 2383 | 1.859421 | 3.476598e-01 | 2.00000 | 1.949135 | 0.0000000 | 1.00000 | 2.00000 | 1.00000e+00 | -2.0667910 | 2.2725791 | 0.0071218 |
| SUBJECT_DESCRIPTION* | 18 | 2383 | 9.242551 | 5.211150e+00 | 11.00000 | 9.552701 | 4.4478000 | 1.00000 | 15.00000 | 1.40000e+01 | -0.6223513 | -1.2286417 | 0.1067509 |
| SUBJECT_OFFENSE* | 19 | 2383 | 250.870331 | 1.769465e+02 | 304.00000 | 248.448873 | 220.9074000 | 1.00000 | 551.00000 | 5.50000e+02 | -0.0999534 | -1.3374383 | 3.6247660 |
| REPORTING_AREA | 20 | 2383 | 3190.562736 | 1.936015e+03 | 2231.00000 | 2944.816466 | 1700.5422000 | 1001.00000 | 9611.00000 | 8.61000e+03 | 1.1669871 | 1.4928651 | 39.6594516 |
| BEAT | 21 | 2383 | 392.772556 | 2.104614e+02 | 351.00000 | 383.353959 | 292.0722000 | 111.00000 | 757.00000 | 6.46000e+02 | 0.2649576 | -1.2814252 | 4.3113212 |
| SECTOR | 22 | 2383 | 389.022241 | 2.105877e+02 | 350.00000 | 379.480860 | 296.5200000 | 110.00000 | 750.00000 | 6.40000e+02 | 0.2677882 | -1.2869836 | 4.3139098 |
| DIVISION* | 23 | 2383 | 3.688208 | 2.137520e+00 | 3.00000 | 3.610383 | 2.9652000 | 1.00000 | 7.00000 | 6.00000e+00 | 0.1453711 | -1.4108341 | 0.0437873 |
| LOCATION_DISTRICT* | 24 | 2383 | 7.785145 | 3.652288e+00 | 7.00000 | 7.841636 | 4.4478000 | 1.00000 | 14.00000 | 1.30000e+01 | -0.0682703 | -1.0446330 | 0.0748175 |
| STREET_NUMBER | 25 | 2383 | 4903.800671 | 4.532293e+03 | 3415.00000 | 4297.367593 | 3536.0010000 | 0.00000 | 54023.00000 | 5.40230e+04 | 2.3118702 | 12.7090834 | 92.8444450 |
| STREET_NAME* | 26 | 2383 | 530.386488 | 3.026965e+02 | 524.00000 | 527.276350 | 381.0282000 | 1.00000 | 1080.00000 | 1.07900e+03 | 0.0787616 | -1.1560959 | 6.2007659 |
| STREET_DIRECTION* | 27 | 2383 | 2.986572 | 7.621312e-01 | 3.00000 | 2.981647 | 0.0000000 | 1.00000 | 5.00000 | 4.00000e+00 | 0.0395708 | 2.3493465 | 0.0156123 |
| STREET_TYPE* | 28 | 2383 | 12.185481 | 6.737242e+00 | 13.00000 | 12.456738 | 8.8956000 | 1.00000 | 22.00000 | 2.10000e+01 | -0.2966993 | -1.4051343 | 0.1380131 |
| LOCATION_FULL_STREET_ADDRESS_OR_INTERSECTION* | 29 | 2383 | 642.278640 | 3.961413e+02 | 631.00000 | 638.865758 | 520.3926000 | 1.00000 | 1322.00000 | 1.32100e+03 | 0.0558407 | -1.2824814 | 8.1149925 |
| LOCATION_CITY* | 30 | 2383 | 1.000000 | 0.000000e+00 | 1.00000 | 1.000000 | 0.0000000 | 1.00000 | 1.00000 | 0.00000e+00 | NaN | NaN | 0.0000000 |
| LOCATION_STATE* | 31 | 2383 | 1.000000 | 0.000000e+00 | 1.00000 | 1.000000 | 0.0000000 | 1.00000 | 1.00000 | 0.00000e+00 | NaN | NaN | 0.0000000 |
| LOCATION_LATITUDE | 32 | 2383 | 32.801958 | 8.529450e-02 | 32.78406 | 32.796274 | 0.0764651 | 32.63318 | 33.01519 | 3.82007e-01 | 0.6003629 | -0.1638629 | 0.0017473 |
| LOCATION_LONGITUDE | 33 | 2383 | -96.783915 | 6.431190e-02 | -96.79111 | -96.785531 | 0.0489050 | -96.95503 | -96.57442 | 3.80608e-01 | 0.2897695 | -0.0177765 | 0.0013174 |
| INCIDENT_REASON* | 34 | 2383 | 5.850608 | 4.359516e+00 | 3.00000 | 5.544835 | 1.4826000 | 1.00000 | 14.00000 | 1.30000e+01 | 0.4032136 | -1.6900293 | 0.0893051 |
| REASON_FOR_FORCE* | 35 | 2383 | 4.963911 | 3.340519e+00 | 3.00000 | 4.624541 | 2.9652000 | 1.00000 | 12.00000 | 1.10000e+01 | 0.7544860 | -0.6290041 | 0.0684308 |
| TYPE_OF_FORCE_USED1* | 36 | 2383 | 21.129249 | 8.794610e+00 | 27.00000 | 22.242790 | 2.9652000 | 1.00000 | 29.00000 | 2.80000e+01 | -0.7590019 | -1.0215376 | 0.1801584 |
| NUMBER_EC_CYCLES* | 37 | 2383 | 11.419220 | 2.245828e+00 | 12.00000 | 12.000000 | 0.0000000 | 1.00000 | 12.00000 | 1.10000e+01 | -3.7111750 | 12.0582682 | 0.0460060 |
| FORCE_EFFECTIVE* | 38 | 2383 | 59.353756 | 2.324481e+01 | 69.00000 | 61.410593 | 20.7564000 | 1.00000 | 104.00000 | 1.03000e+02 | -0.6866249 | -0.2577183 | 0.4761721 |
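The cleaning rule described before the summary table can be sketched in a few lines. This is a minimal illustration in plain Python, not the code used to produce the report: the function name `clean_columns`, the toy columns, and the use of `None` for NA are all assumptions. The report does not say what happens to columns with between 10% and 60% missing values, so the sketch leaves them unchanged.

```python
from statistics import median

def clean_columns(columns, drop_above=0.60, fill_below=0.10):
    """Drop mostly-missing columns and median-fill lightly-missing numeric ones.

    columns: dict mapping a column name to a list of numbers, None marks NA.
    """
    cleaned = {}
    for name, values in columns.items():
        na_share = sum(v is None for v in values) / len(values)
        if na_share > drop_above:
            continue  # more than 60% NA: drop the whole column
        if 0 < na_share < fill_below:
            # less than 10% NA: replace each NA with the column median
            fill = median(v for v in values if v is not None)
            values = [fill if v is None else v for v in values]
        cleaned[name] = values  # other columns are kept as they are (assumption)
    return cleaned

# Hypothetical toy data, 20 rows per column
toy = {
    "mostly_missing": [None] * 14 + [1] * 6,    # 70% NA
    "few_missing": list(range(1, 20)) + [None],  # 5% NA, median of 1..19 is 10
}
result = clean_columns(toy)
print(sorted(result))
print(result["few_missing"][-1])
```

Median imputation only makes sense for numeric columns; categorical columns would need a different fill value, such as the most frequent level.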
This graph shows the race of police officers at different locations and how many years they have spent at each location. It tells us how officers are distributed by location and whether officers of one race tend to stay longer at a particular location than officers of other races. The most spread-out group is “American Ind”, which may be because they are few in number on the force, and the central part of the area is dominated by White police officers.
A boxplot is a standardized way of displaying the distribution of a data set, including its spread and outliers. Here it shows how much time officers of each race have spent on the force. White officers tend to spend much longer on the force than other officers, and there are many outliers who have served longer than the typical officer. The same pattern appears for “Hispanic” officers. In both groups many points lie outside the typical range of the distribution, which is not the case for the other officers.
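The outliers a boxplot draws are usually the points beyond 1.5 times the interquartile range from the quartiles. The sketch below applies that rule in plain Python. The function name `boxplot_outliers` and the years-on-force values are made up for illustration, and `statistics.quantiles` computes quartiles slightly differently from the hinges used by some plotting libraries, so the flagged points may differ at the margins.

```python
from statistics import quantiles

def boxplot_outliers(values, k=1.5):
    """Return the values lying beyond k * IQR from the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical years on the force for one group of officers
years = [0, 1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 36]
outliers = boxplot_outliers(years)
print(outliers)
```

With these toy values, only the officer with 36 years on the force is flagged.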
In this graph we look at which incident reasons are recorded in each sector. A tile chart gives a clear picture of how those calls are distributed across the sectors. “Traffic Stop” is the most common incident reason across all sectors, with tile colors ranging from light to dark. “Crowd Control” and “Accidental Discharge” have the fewest calls, whereas “Arrest” and “call for Cover” are also among the most frequent incident reasons.
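A tile chart like this is built from a table of counts with one cell per combination of sector and incident reason. The sketch below shows one way to build such a grid in plain Python, filling combinations that never occur with zero so that every row has the same length. The function name `tile_grid` and the records are hypothetical.

```python
from collections import Counter

def tile_grid(records):
    """Build a reasons-by-sectors grid of counts from (sector, reason) pairs."""
    counts = Counter(records)
    sectors = sorted({sector for sector, _ in records})
    reasons = sorted({reason for _, reason in records})
    # Counter returns 0 for pairs that never occur, so the grid is complete
    grid = [[counts[(s, r)] for s in sectors] for r in reasons]
    return sectors, reasons, grid

# Hypothetical (sector, incident reason) records
records = [
    (110, "Arrest"), (110, "Traffic Stop"),
    (120, "Arrest"), (120, "Arrest"),
    (130, "Traffic Stop"),
]
sectors, reasons, grid = tile_grid(records)
for reason, row in zip(reasons, grid):
    print(reason, row)
```

Each row of `grid` corresponds to one incident reason and each column to one sector, which is the layout a heatmap function typically expects.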
A facet grid lets us see multiple graphs at once and compare categories easily. In this graph we look at the injury pattern of officers over their years of duty, split by gender and race. We found a positive relationship between age and time spent on the force for White female officers, and the same holds for Black female and male officers. For Asian female officers we did not find any trend related to injury, but there is an injury trend for Asian male officers.
In this graph we plot categorical variables against each other to look for relationships between them. We plotted subject race against subject gender with respect to reporting area. Male subjects of the different races are reported across all the reporting areas, and a similar pattern can be seen for Black and White female subjects. Hispanic female subjects are the exception: for them we see reports in all the areas. Surprisingly, no data points appear for “American Ind” and “Asian” females.
In the next graph we combine a box plot and a scatter plot to show the distribution of injury between genders across the sectors. The data set contains 440 females, 1932 males, 10 null values and 1 unknown value. The distributions for males and females are spread about equally across the sectors for injury, with males stretching a little further than females.
In this chart we look at the timeline of crimes across the time period covered by our data set. To keep the visualization understandable, only a range of values is selected from the data set. The number of crimes peaks in June, October, March and August, in descending order. For the rest of the year the crime rate stays mostly the same, except in the first months of the year, when it is relatively low compared to other months. We can also see activity decreasing toward the end of the year.
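The monthly peaks described above come down to counting incidents per calendar month. The following plain Python sketch shows that counting step. The function name `incidents_per_month`, the sample dates, and the `%m/%d/%y` date format are assumptions, since the report does not show how `INCIDENT_DATE` is stored.

```python
from collections import Counter
from datetime import datetime

def incidents_per_month(date_strings, fmt="%m/%d/%y"):
    """Count incidents per calendar month (1 = January, ..., 12 = December)."""
    months = (datetime.strptime(s, fmt).month for s in date_strings)
    return Counter(months)

# Hypothetical incident dates in an assumed month/day/year format
dates = ["6/1/16", "6/15/16", "10/3/16", "3/9/16", "6/30/16", "10/20/16"]
per_month = incidents_per_month(dates)
# Months ordered from most to fewest incidents
print(per_month.most_common())
```

`most_common()` returns the months sorted by count, which gives the descending order of peaks directly.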
Next, we look at subject description with respect to subject gender. The description “Mentally unstable” has the highest number of females, followed by “Unknown” and “Alcohol”, whereas the male category is dominated by the descriptions “Alcohol”, “unknown” and “unknown Drugs”, followed by “Mentally Unstable”. “Motor Vehicle” is the least frequent subject description, followed by “Gun” and “weapon”.
Here we plot a correlation matrix of the columns that contain officer details to see what the officers' records have in common. Male officers tend to stay longer on the force than female officers, and the amount of injury among male officers is also higher than among female officers. Interestingly, the level of injury is higher in an officer's first years on the force, meaning that younger officers tend to get injured more, and the injury trend decreases as years on the force increase.
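Each cell of a correlation matrix is a pairwise correlation coefficient. The sketch below computes a Pearson coefficient by hand in plain Python for two columns, with the categorical injury flag encoded as 0 or 1. The function name `pearson`, the encoding, and the values are illustrative assumptions rather than figures from the data set, and the report does not state which correlation method it used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical officers: years on the force and injury flag (1 = injured)
years_on_force = [1, 2, 3, 4, 5]
injured = [1, 1, 0, 0, 0]
r = pearson(years_on_force, injured)
print(round(r, 3))
```

A negative coefficient, as in this toy example, is what the text describes: injuries concentrated in the early years on the force.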
Here the pie chart shows the percentage of subjects by racial background. The largest slice is covered by subjects whose racial identity is “Black”, at around 55%. The next largest is subjects whose racial identity is “white”, at around 20%. Subjects of every other racial identity stay below 15%.
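The slices of a pie chart are each group's count divided by the total. A minimal plain Python sketch of that step follows. The function name `percentage_shares` and the counts are invented for illustration and are not the actual counts in the data set.

```python
def percentage_shares(counts):
    """Convert a mapping of group -> count into group -> percentage of the total."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}

# Hypothetical subject counts per group, chosen only to make the arithmetic easy
counts = {"Black": 11, "White": 4, "Hispanic": 4, "Other": 1}
shares = percentage_shares(counts)
for group, share in shares.items():
    print(f"{group}: {share:.1f}%")
```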